I, Ibrahim Sezan, hereby declare as follows: 

1. My residence address is 2213 NW Hood Drive, Camas, Washington 98607. 

2. I, together with Petrus Van Beek and George Borden IV, am a co-inventor of the 
present U.S. Patent Application Serial No. 09/882,416, which claims the benefit of the filing date 
of provisional application Serial No. 60/214,878 filed on June 28, 2000. I am also one of the 
joint inventors of U.S. Patent No. 6,070,167, which was cited by the Examiner as prior art in a 
rejection under 35 U.S.C. § 103(a). That patent issued on, and has an effective date of, May 30, 
2000. 

3. Based upon my knowledge, and co-inventorship, of the subject matter of the 
present U.S. Patent Application Serial No. 09/882,416, the subject matter disclosed in, and 
claimed by, the present application was reduced to practice on a date prior to May 30, 2000, the 
effective date of Qian et al., U.S. Patent No. 6,070,167. Attached to this declaration is a redacted 
version of an invention disclosure statement that describes the subject matter claimed by the 
present application. That invention disclosure statement details the conception date of the 
invention(s) claimed by the present application, along with the dates at which the invention was 
described in writing, and fully disclosed to a patent attorney involved in preparing the present 
United States Code and that such willful, false statements may jeopardize the validity of the 
application or any patent issuing thereon. 
3. Project & Supervisor: 

Supervisor's Name: 
4. Conception of the Invention: 

Date Conceived: 
Date of first Written Description: 
Notebook & Page No. or File Archive: 
Date first explained to others (whom?): 
Planned Application for the Invention: 
5. Construction & Test of First Prototype Embodying the Invention: 

Date First Prototype Completed: N/A 

Part Number/Product Description: 

Date of First Successful Test: 

Successful Operation Witnessed By: 
6. Public Disclosure of Invention (Presentation at public meeting or publication): 
(NOTE: Patent Application MUST be filed prior to any public disclosure.): 

Date of First Public Disclosure: 

Setting (Conference/Journal Name): 

Title of Paper or Presentation: 

Type of Disclosure (Written/Verbal): 

Does Data Sheet or Application Note Disclose the Invention (when)? 
7. What is the field of the invention (Invention relates to...): 

The invention relates to embedding metadata in a JPEG2000 file by taking advantage of the file format 
specification in Part 1 of the JPEG2000 standard, and by making use of the MPEG-7 standard so that the 
syntax and semantics of the embedded metadata are compliant with an international standard and thus 
such information can be exchanged and consumed by a wide range of different applications. 

8. What is the problem solved by your invention? How is it solved in the prior art (do not put 
search pages here)? 

The JPEG2000 committee has recently completed a Final Committee Draft (FCD) of their Part 1 
standardization. The JPEG2000 FCD includes a file format specification (JP2 file format) where the file 
format encapsulates the coded image bitstream as well as metadata. Metadata may be contained in 
"boxes" that store information expressed in Extensible Markup Language (XML) (the so-called "XML 
boxes"), and boxes that store binary data where the data boxes (the so-called "UUID boxes" where UUID 
stands for Universally Unique Identifier) can be uniquely identified. The UUID boxes allow vendors to store 
data in the file without conflicting with other vendors. 

Through the use of the XML and UUID boxes, the JPEG2000 file format allows the storage of metadata 
about the image within the same file containing the coded image data itself. (Complete specification of the 
JPEG2000 file format including the definition of all boxes can be found in [1], "ISO/IEC JTC 1/SC 29/WG1 
N1646, JPEG 2000 Part I Final Committee Draft Version 1.0," March 2000.) 
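
The box layout described above can be illustrated with a short sketch. It assumes the box header of the JP2 file format (a 4-byte big-endian length that includes the header, a 4-byte type, and an 8-byte extended length when the length field is 1) and the type codes "xml " and "uuid". The helper names make_box and iter_boxes, the vendor identifier, and the payloads are hypothetical and only serve the illustration.

```python
import struct
import uuid


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize one box: 4-byte length (header included), 4-byte type, payload."""
    assert len(box_type) == 4
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def iter_boxes(data: bytes):
    """Yield (type, payload) for each top-level box in a byte string."""
    pos = 0
    while pos + 8 <= len(data):
        length, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if length == 1:  # extended 8-byte length follows the type field
            (length,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif length == 0:  # box runs to the end of the file
            length = len(data) - pos
        if length < header:
            raise ValueError("corrupt box length")
        yield box_type, data[pos + header : pos + length]
        pos += length


# Hypothetical vendor identifier and contents, for illustration only.
VENDOR_ID = uuid.UUID("12345678-1234-5678-1234-567812345678")
sample = (
    make_box(b"xml ", b"<Description/>")
    + make_box(b"uuid", VENDOR_ID.bytes + b"\x01\x02\x03")
)

for box_type, payload in iter_boxes(sample):
    if box_type == b"uuid":
        # A UUID box starts with the 16-byte identifier, then vendor data.
        print("uuid box", uuid.UUID(bytes=payload[:16]), len(payload) - 16, "data bytes")
    else:
        print(box_type.decode("ascii").strip(), "box", len(payload), "payload bytes")
```

Because every box carries its own length, a reader that does not understand a vendor's UUID box can skip it without parsing its contents.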

The JPEG2000 specification does not, however, define the syntax and semantics for the metadata that can 
be placed in the XML and/or UUID boxes in the JP2 file. There is therefore a need for a specification of 
syntax and semantics for the contents of these boxes, preferably a standardized syntax and semantics 
specification that will permit the exchangeability of the metadata contents contained in these boxes. The 
current invention provides a syntax and semantics specification for metadata that may populate the XML 
and UUID boxes. 

In parallel to the JPEG2000 standardization activity, MPEG-7 is working towards standardization of media 
content descriptions - a structured set of descriptors and metadata expressed in XML. It is therefore 
possible to include the compressed image data and the descriptions about the image content in a single 
file - a JPEG2000 file - using the JPEG2000 file format specification, particularly the XML box structure. 

Here, we focus on particular types of metadata for a particular class of applications. We consider the 
problem of defining and identifying multiple "hot spots" (e.g., bounding boxes) in the image and associating 
metadata and information with these hot spots, where metadata are typically related to the objects (or 
image regions) that are highlighted by the hot spots. We show how the XML box, or the XML and the 
UUID boxes, in the JPEG2000 file format can be utilized to store descriptions and data that define and 
identify the hot spot regions as well as data associated with these regions, such as object-specific URL 
links, voice annotation, and textual annotation. One of many applications of such data is user interaction 
with images where users interactively discover and consume information that relates to the contents of the 
image. 

Although our focus here is MPEG-7 compliant XML documents (i.e., XML documents that are valid 
according to MPEG-7 description schemes which are expressed in terms of the XML Schema), the XML 
box in a JPEG2000 file may contain XML documents that are not necessarily MPEG-7 compliant. The 
schema of the document and the semantics may vary from application to application. The validating XML 
Schema may be stored in the JPEG2000 file, for example in the UUID box. In that case a JP2 file 
becomes self-contained. 

In US 6,070,167 we have disclosed a 2-level hierarchical data structure for object-based information 
embedding to images, with particular focus on defining hot spots and embedding various types of data. In 
SLA107, we have disclosed a proprietary file format specification, called JFIF+, for embedding object-based 
information in JPEG images. 

9. How is your solution different from the prior art (one paragraph or list)? 

In this invention, we consider the JPEG2000 file format specification. We take advantage of the fact that the 
JPEG2000 file format allows for an XML box structure for metadata embedding. We populate 
the XML boxes with MPEG-7 descriptions of the image content, which are expressed in XML. This 
solution is advantageous with respect to that proposed in SLA107 [8] because it is based on an 
international standard. 
In particular we focus on describing and identifying multiple image regions and associating metadata with 
these regions by making use of the XML box mechanism defined by JPEG2000. 

To the best of our knowledge, bringing the two international standards (MPEG-7 and JPEG2000) together 
in this type of application is novel. 

10. Please give a detailed description, of your invention, include any graphics, notebook pages 
or other material necessary to understand your invention. 



Introduction 



In the following, the discussions are based on Description Schemes that are expressed in MPEG-7 
Description Definition Language (DDL) [4], which is based on the XML Schema Language (XSL). Note 
that Description Schemes specify the syntax and semantics of the corresponding descriptions, which are 
expressed in XML. Descriptions are generated according to the corresponding description schemes; in 
other words, they have to be valid according to the description schemes. 



Definition and Identification of Multiple Hot Spots in an XML Box 

Definition and identification of multiple hot spots in the image is achieved using the Still Region Description 
Scheme [2][3]. The Still Region Description Scheme (DS) is derived from the Segment Description 
Scheme. The Segment Description Scheme is used to specify the structure of spatial and temporal 
segments of visual data such as images and video in general. Segments can be decomposed into other 
segments. The Still Region Description Scheme is used to specify a spatial type of segment in still images 
or single video frames. 

The Segment Description Scheme (DS) and the Still Region DS, as expressed in XSL, are specified as 
follows (parts of the syntax and semantics specification that are particularly relevant to hot spots are 
highlighted in gray; in some cases, notes are added to the semantic definitions in order to clarify the 
usage in this particular application; in other cases, semantics of some of the elements are skipped for the 
sake of simplicity): 



<!-- ################################################ -->
<!-- Definition of "Segment DS"                       -->
<!-- ################################################ -->

<!-- Definition of datatype of the decomposition -->
<simpleType name="DecompositionDataType" base="string">
  <enumeration value="spatial"/>
  <enumeration value="temporal"/>
  <enumeration value="spatio-temporal"/>
  <enumeration value="MediaSource"/>
</simpleType>

<!-- Definition of the decomposition -->
<complexType name="SegmentDecomposition">
  <element ref="Segment" minOccurs="1" maxOccurs="unbounded"/>
  <attribute name="DecompositionType" type="mds:DecompositionDataType" use="required"/>
  <attribute name="Overlap" type="boolean" use="default" value="false"/>
  <attribute name="Gap" type="boolean" use="default" value="false"/>
</complexType>
<element name="Segment" type="mds:Segment"/>

<!-- Definition of the Segment itself -->
<complexType name="Segment" abstract="true">
  <element name="MediaInformation" type="mds:MediaInformation" minOccurs="0" maxOccurs="1"/>
  <!-- [highlighted element(s) illegible in source] -->
  <element name="UsageMetaInformation" type="mds:UsageMetaInformation" minOccurs="0" maxOccurs="1"/>
  <element name="MatchingHint" type="mds:MatchingHint" minOccurs="0" maxOccurs="unbounded"/>
  <element name="PointOfView" type="mds:PointOfView" minOccurs="0" maxOccurs="unbounded"/>
  <element name="SegmentDecomposition" type="mds:SegmentDecomposition" minOccurs="0" maxOccurs="unbounded"/>
  <attribute name="href" type="uriReference" use="optional"/>
  <attribute name="idref" type="IDREF" refType="Segment" use="optional"/>
</complexType>

Semantic of the SegmentDecomposition DS:

Name                      Definition

SegmentDecomposition      Decomposition of a segment into one or more segments.

DecompositionType         Attribute which specifies the decomposition type of a segment.

Overlap                   Boolean which specifies if the segments resulting from a segment
                          decomposition overlap in time or space. This attribute is optional.

Gap                       Boolean which specifies if the segments resulting from a segment
                          decomposition leave gaps in time or space. This attribute is optional.

Segment                   Set of (sub-)segments that form the decomposition.


Semantic of the Segment DS:

Name                      Definition

Segment                   Abstract structure which represents a fragment or section of the
                          AV content. For example, a segment could be a region in an
                          image or a moving region in a video sequence.

id                        Identifier of a video segment [remainder illegible].

DecompositionDataType     Datatype defining the kind of segment decomposition. The
                          possible kinds of segment decomposition are spatial, temporal,
                          spatio-temporal, and media source.

MediaInformation          Media information related to the segment and its descendants.

CreationMetaInformation   Creation Meta information related to the segment and its
                          descendants. (Note: [illegible] segments, such as [illegible]
                          below.)

UsageMetaInformation      Usage Meta information related to the segment and its
                          descendants.

SegmentDecomposition      Decomposition of the segment into sub-segments.

[name illegible]          Textual annotation [illegible] annotations.


<!-- ################################################ -->
<!-- Definition of "StillRegion DS"                   -->
<!-- ################################################ -->

<element name="StillRegion" type="mds:StillRegion" equivClass="Segment"/>

<complexType name="StillRegion" base="mds:Segment" derivedBy="extension">
  <element ref="ColorSpace" minOccurs="0" maxOccurs="1"/>
  <element ref="ColorQuantization" minOccurs="0" maxOccurs="1"/>
  <element ref="DominantColor" minOccurs="0" maxOccurs="1"/>
  <element ref="ColorHistogram" minOccurs="0" maxOccurs="1"/>
  <element ref="BoundingBox" minOccurs="0" maxOccurs="1"/>
  <element ref="RegionShape" minOccurs="0" maxOccurs="1"/>
  <element ref="ContourShape" minOccurs="0" maxOccurs="1"/>
  <element ref="ColorStructureHistogram" minOccurs="0" maxOccurs="1"/>
  <element ref="ColorLayout" minOccurs="0" maxOccurs="1"/>
  <element ref="CompactColor" minOccurs="0" maxOccurs="1"/>
  <element ref="HomogeneousTexture" minOccurs="0" maxOccurs="1"/>
  <element ref="TextureBrowsing" minOccurs="0" maxOccurs="1"/>
  <element ref="EdgeHistogram" minOccurs="0" maxOccurs="1"/>
  <attribute name="SpatialConnectivity" type="boolean" use="required"/>

  <!-- Restriction of refType to StillRegion DS -->
  <attribute name="idref" type="IDREF" refType="StillRegion" use="optional"/>
</complexType>



Name                      Definition

StillRegion               Set of pixels from an image or a frame in a video sequence. Note,
                          however, that no motion information can be used to describe a still
                          region. Still images can be natural images or synthetic images. A
                          still image is a particular case of still region. The pixels do not need
                          to be connected (see the SpatialConnectivity attribute).

SpatialConnectivity       Boolean which specifies if a still region is connected in space, i.e.,
                          connected pixels.

ColorSpace                Description of the color space used for the color Ds and DSs of the
                          still region (see the Visual part of the standard [5,6]).

ColorQuantization         Description of the color quantization used for the color Ds and DSs
                          of the still region (see the Visual part of the standard [5,6]).

DominantColor             Description of the DominantColor of the region (see the Visual part
                          of the standard [5,6]).

ColorHistogram            Description of the color histogram of the region (see the Visual part
                          of the standard [5,6]).

BoundingBox               [Largely illegible in source.] Description of a bounding box used to
                          describe the region; a note refers to straightforward geometry for
                          the hot spots (see the Visual part of the standard [5,6]).

RegionShape               Description of the region shape (see the Visual part of the standard
                          [5,6]).

ContourShape              Description of the region shape (see the Visual part of the standard
                          [5,6]).

ColorStructureHistogram   (see the Visual part of the standard [5,6])

ColorLayout               (see the Visual part of the standard [5,6])

CompactColor              (see the Visual part of the standard [5,6])

HomogeneousTexture        (see the Visual part of the standard [5,6])

TextureBrowsing           (see the Visual part of the standard [5,6])

EdgeHistogram             (see the Visual part of the standard [5,6])




To summarize: 
a. Using the above specifications, the hot spots in a JPEG image may be described as spatial 
segments. 

b. The descriptor BoundingBox is used to define the locations and dimensions of (multiple) hot-spot 
regions. 

c. Each region is identified by an id, which is unique in the scope defined by the XML box. 

Next, we will discuss the use of the CreationMetaInformation DS and the Annotation DS to attach URLs 
and/or textual annotations to the hot spots. 
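
As a rough illustration of points (a) to (c), the sketch below models each hot spot as an identified bounding box and finds the hot spots under a pointer position. The HotSpot class, its field names, and the sample coordinates are hypothetical. The disclosure does not fix a coordinate convention, so pixel units with the origin at the top-left corner are assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HotSpot:
    """One hot spot: an id unique within the XML box plus a bounding box."""
    region_id: str
    x: int       # left edge (assumed pixel units, origin at top-left)
    y: int       # top edge
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hot_spots_at(spots, px, py):
    """Return ids of all hot spots covering the point (regions may overlap)."""
    return [s.region_id for s in spots if s.contains(px, py)]


def check_unique_ids(spots):
    """Enforce point (c): ids must be unique within one XML box."""
    ids = [s.region_id for s in spots]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate hot spot id")


spots = [HotSpot("R1", 10, 10, 100, 50), HotSpot("R2", 80, 40, 60, 60)]
check_unique_ids(spots)
print(hot_spots_at(spots, 90, 45))   # both regions cover this point
print(hot_spots_at(spots, 5, 5))     # no region covers this point
```

The returned ids are what a client would use to look up the annotations, URLs, or media attached to the selected region.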



Embedding of Textual Annotation, Voice Annotation and URLs in an 
XML Box 

In the following, we address embedding of textual annotations, voice annotations and/or URLs to hot spots 
that may be defined and identified using the XML descriptions generated by the Description Schemes 
discussed above. 



Embedding of Textual Annotation 

Textual annotation for each one of the hot spots is implemented according to the 
StructuredAnnotation DS that is referenced by the Segment DS as shown above. Each 
segment can reference the Annotation DS individually and at multiplicities identified by their corresponding 
identifiers. 




<!-- ################################################ -->
<!-- Definition of StructuredAnnotation DS            -->
<!-- ################################################ -->

<element name="TextAnnotation" type="mds:TextualDescription"/>
<element name="StructuredAnnotation" type="mds:StructuredAnnotation"/>
<complexType name="StructuredAnnotation">
  <element name="Who" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="WhatObject" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="WhatAction" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Where" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="When" type="mds:ControlledTerm" minOccurs="0"/>
  <!-- [one element with minOccurs="0" illegible in source] -->
  <attribute name="id" type="ID"/>
  <attribute ref="xml:lang"/>
</complexType>



The semantics are defined as: 
Name                      Definition

TextAnnotation            Free textual annotation.

StructuredAnnotation      Textual free annotation and description of people, animals, objects,
                          actions, places, time, and/or purpose.

Who                       Textual description of people and animals. May be from a thesaurus or
                          a controlled vocabulary.

WhatObject                Textual description of objects. May be from a thesaurus or a controlled
                          vocabulary.

WhatAction                Textual description of actions. May be from a thesaurus or a controlled
                          vocabulary.

Where                     Textual description of places. May be from a thesaurus or a controlled
                          vocabulary.

When                      Textual description of time. May be from a thesaurus or a controlled
                          vocabulary.

[...]Annotation           [Illegible in source; the surviving fragments read "controlled
                          vocabulary" and "actions".]

id                        Identifier for an instantiation of the StructuredAnnotation DS.




Embedding of URLs 

URL links for each hot spot are realized using the RelatedMaterial description. The RelatedMaterial 
DS is referenced by the CreationMetaInformation DS that is defined above. Each segment (e.g., 
each hot spot) references the CreationMetaInformation DS, multiple times if desired (see the Segment 
DS specification above). The RelatedMaterial DS is specified as follows: 

<!-- Definition of the RelatedMaterial DS -->

<DSType name="RelatedMaterial">
  <attribute name="id" datatype="ID"/>
  <attribute name="Master" datatype="boolean" default="true" required="false"/>
  <DSTypeRef type="CreationMetaInformation" minOccurs="0"/>
  <DSTypeRef type="UsageMetaInformation" minOccurs="0"/>
</DSType>



Name                      Definition

RelatedMaterial           Description [remainder illegible].

Master                    Boolean attribute that identifies whether the referenced related
                          material is the master.

MediaInformation          [Illegible] material.

CreationMetaInformation   The Creation Meta Information description of the referenced
                          related material.

UsageMetaInformation      The Usage Meta Information description of the referenced related
                          material.




The MediaType descriptor describes the type of the media (e.g., audio, web). For a URL link, 
MediaType is "web" and the URL is specified using the MediaLocator. 

To point to related media that is stored in the same JPEG2000 file, the MediaLocator will point to the 
same JPEG2000 file, as will be discussed below. 



Embedding of Voice Annotation 

Now we address embedding of data other than textual annotation or URL links to the hot spot regions. An 
example of such data is audio for voice annotation. Voice annotation is realized by specifying the media 
type in the RelatedMaterial description as "audio". 

We now address the specification of the MIME type/format of the audio data and the encapsulation of the data 
itself within a JPEG2000 file. Our method given below is applicable to any binary data (e.g., an executable 
computer program, where MediaType may be specified as "executable code"), and is therefore not 
limited to audio and voice annotation. It makes use of the XML box mechanism in the JPEG2000 file 
format. The media data is stored within the XML box; we refer to this mode of media storage as "in-line 
media". 

We first define the InlineMedia DS that will be utilized to specify MIME-type/format and actual 
carriage of the inline media data. This is achieved by modifying the current MediaLocator specification in 
MPEG-7. We modify the current specification by incorporating the InlineMedia description into the 
MediaLocator, as we show below. 

The following datatype defines a binary string, e.g. "98A34F10C5", where each byte is encoded 
using two ASCII characters. This is like 'uuencoding' the binary data. Note that storage size is multiplied 
by 2 and conversion between the actual binary stream and its XML encoding is necessary before 
decoding/playback. An alternative base64 encoding is also allowed in XML Schema Language, leading to 
a storage expansion factor of 1.5 instead of 2. 

<simpleType name="hexBinary" base="binary">
  <encoding value="hex"/>
</simpleType>
InlineMedia is then defined as follows, where the format of the media stream is indicated by a choice of 
either a MediaFormat DS or a FileFormat (MIME-type) identifier, as shown below: 

<complexType name="InlineMedia">
  <choice>
    <element name="MediaFormat" type="mds:MediaFormat"/>
    <element name="FileFormat" type="mds:ControlledTerm"/>
  </choice>
  <element name="MediaData" type="hexBinary"/>
</complexType>

The InlineMedia description contains the data itself. 
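
The storage cost of carrying binary media as text in MediaData can be checked with a short sketch. It uses only Python's standard binascii and base64 modules; the payload is made-up stand-in data. With this encoder, hex yields exactly two characters per byte, while unbroken base64 output comes to about 4/3 characters per byte, somewhat below the factor of 1.5 quoted in the text, so the sketch prints measured ratios instead of asserting either figure.

```python
import base64
import binascii

payload = bytes(range(256)) * 12  # 3072 bytes of stand-in media data

# Hex form, as in the hexBinary datatype: two ASCII characters per byte.
hex_text = binascii.hexlify(payload).decode("ascii").upper()
# Base64 form: four ASCII characters per three bytes.
b64_text = base64.b64encode(payload).decode("ascii")

print("hex ratio:", len(hex_text) / len(payload))
print("base64 ratio:", len(b64_text) / len(payload))

# Decoding restores the original byte stream before decoding/playback.
assert binascii.unhexlify(hex_text) == payload
assert base64.b64decode(b64_text) == payload
```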



The MediaFormat DS is specified in MPEG-7 as follows: 

<!-- ################################################ -->
<!-- Definition of the MediaFormat DS -->

<complexType name="MediaFormat">
  <element name="System" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Medium" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Color" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Sound" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="FileSize" type="nonNegativeInteger" minOccurs="0"/>
  <element name="Length" type="mds:TimePoint" minOccurs="0"/>
  <element name="AudioChannels" type="nonNegativeInteger" minOccurs="0"/>
  <element name="AudioLanguage" type="language" minOccurs="0"/>
  <attribute name="id" type="ID"/>
</complexType>




Name                      Definition

MediaFormat               Description of the storage format of the media.

id                        Identification of the instance of the media format description.

[name illegible]          The file format or [remainder illegible].

System                    The video system of the AV content (e.g., PAL, NTSC).

Medium                    The medium on which the AV content is stored (e.g., tape, CD,
                          DVD).

Color                     The color domain of the AV content (e.g., color, b/w, colored).

Sound                     The sound domain of the AV content (e.g., no sound, stereo, mono,
                          dual).

FileSize                  The size, in bytes, of the file where the AV content is stored.

Length                    The duration of the AV content.

AudioChannels             The number of audio channels in the AV content.

AudioLanguage
Finally, the existing MediaLocator in MPEG-7 (Documents N3410 and N3411, May 2000) is extended by 
adding the InlineMedia DS as follows. 

<complexType name="MediaLocator">
  <choice>
    <sequence>
      <element name="MediaURL" type="mds:MediaURL"/>
      <element name="MediaTime" type="mds:MediaTime" minOccurs="0"/>
    </sequence>
    <!-- [highlighted InlineMedia alternative illegible in source] -->
    <element name="MediaTime" type="mds:MediaTime"/>
  </choice>
</complexType>




An example instantiation of the MediaLocator DS becomes: 

<MediaLocator>
  <InlineMedia>
    <FileFormat>mp3</FileFormat>
    <MediaData>98A34F10C5094538AB9387362522DA3</MediaData>
  </InlineMedia>
</MediaLocator>
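
A client that receives such a description has to pull the media bytes back out of the XML before playback. The following is a minimal sketch of that step using Python's standard xml.etree.ElementTree and binascii modules. The function name extract_inline_media is hypothetical, namespaces are omitted for brevity, and the MediaData value reuses the short sample string "98A34F10C5" from the datatype discussion.

```python
import binascii
import xml.etree.ElementTree as ET

# Hypothetical instance; MediaData holds five sample bytes in hex form.
DOC = """
<MediaLocator>
  <InlineMedia>
    <FileFormat>mp3</FileFormat>
    <MediaData>98A34F10C5</MediaData>
  </InlineMedia>
</MediaLocator>
"""


def extract_inline_media(xml_text: str):
    """Return (file_format, raw_bytes) from a MediaLocator with InlineMedia."""
    root = ET.fromstring(xml_text.strip())
    inline = root.find("InlineMedia")
    if inline is None:
        raise ValueError("no InlineMedia element")
    file_format = inline.findtext("FileFormat", default="").strip()
    # Drop any whitespace inside the hex string before decoding it.
    hex_text = "".join(inline.findtext("MediaData", default="").split())
    return file_format, binascii.unhexlify(hex_text)


fmt, data = extract_inline_media(DOC)
print(fmt, len(data), "bytes")
```

The file format string tells the client which decoder to hand the recovered bytes to.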

In summary, this method will require the instantiation of MediaType and MediaLocator DS (as defined 
above) in the RelatedMaterial DS. 

An Alternative Implementation: 

An alternative implementation assumes that the media data can be placed at any arbitrary location in the 
JPEG2000 file. In this case, the MediaLocator is alternatively modified as follows: 

<complexType name="MediaLocator">
  <choice>
    <sequence>
      <element name="MediaURL" type="mds:MediaURL"/>
      <element name="MediaTime" type="mds:MediaTime" minOccurs="0"/>
    </sequence>
    <sequence>
      <element name="MediaURL" type="mds:MediaURL"/>
      <element name="ByteOffset" type="nonNegativeInteger"/>
    </sequence>
    <element name="MediaTime" type="mds:MediaTime"/>
  </choice>
</complexType>

In this case, the MediaURL points to the JPEG2000 file itself. The format of the media is specified by the 
MediaFormat DS. 
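
Under this alternative, a reader seeks to the given byte offset in the JPEG2000 file and reads the media from there. The sketch below shows that step on an in-memory stand-in for the file. The function name read_media_at is hypothetical, and the length of the media is assumed to be known from elsewhere (for example the FileSize element of the MediaFormat DS), since the text does not say how it is conveyed.

```python
import io


def read_media_at(stream, byte_offset: int, length: int) -> bytes:
    """Read `length` bytes of media starting at `byte_offset` in the file."""
    stream.seek(byte_offset)
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("media data runs past end of file")
    return data


# Stand-in for a JPEG2000 file: 32 filler bytes followed by 4 media bytes.
fake_file = io.BytesIO(b"\x00" * 32 + b"\xde\xad\xbe\xef")
print(read_media_at(fake_file, 32, 4).hex())
```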
11. What other embodiments or examples are there of your invention? 

In another embodiment, the media data is included in a UUID box in the JPEG2000 file. In this 
embodiment MPEG-7 description schemes are used in their current forms. 

The main issue here is the specification of the UUID of the UUID box, and hence linking the XML Box to 
the UUID Box. According to the JPEG2000 FCD, the UUID specifies/identifies the vendor-specific format 
of the contents of the UUID Box. A URL referencing extensive information about this format can be included 
in the JPEG2000 file using the UUID Info Box mechanism [1]. 

In this embodiment, the UUID box is implicitly referenced from the XML box via the MediaFormat DS. 
Regarding the format of the audio file for voice annotation, it is pre-specified (e.g., published) by the 
vendor as a part of the vendor's format specification for its UUID box. 



<!-- Definition of the MediaProfile DS -->

<DSType name="MediaProfile">
  <attribute name="id" datatype="ID"/>
  <DSTypeRef type="MediaIdentification"/>
  <DSTypeRef type="MediaCoding" minOccurs="0" maxOccurs="*"/>
  <DSTypeRef type="MediaInstance" minOccurs="0" maxOccurs="*"/>
</DSType>



Name                      Definition

MediaProfile              DS describing one profile of the media being described.

id                        Identification of the instance of the MediaProfile description.

MediaIdentification       Identification of the master media profile.

[name illegible]          Description [remainder illegible].

MediaCoding               Description of the coding parameters of the master media profile.

MediaInstance             Identification and the localization of the master media profile.



<!-- ################################################ -->
<!-- Definition of the MediaInformation DS            -->
<!-- ################################################ -->

<DSType name="MediaInformation">
  <attribute name="id" datatype="ID"/>
  <DSTypeRef type="MediaProfile" maxOccurs="*"/>
</DSType>
Name                      Definition

MediaInformation          The MediaInformation DS contains one or more MediaProfile DSs.
                          Each MediaInformation DS is related to one reality. For example, a
                          concert may have been recorded in audio and in audio-visual media.
                          Afterwards each media may be available in different formats, e.g. the
                          audio media in CD, and the audio-visual media in MPEG-1, MPEG-2,
                          and MPEG-4. This will imply four MediaProfiles for the same
                          reality.

id                        Identification of the instance of the MediaProfile description.

MediaProfile              DS describing one profile of the essence being described.



In this alternative embodiment, when the MediaLocator within the RelatedMaterial description points at 
the JPEG2000 file itself via MediaURL, the client application implicitly knows that the related media is 
contained in a UUID box within this same file containing the XML box. The UUID is referenced through the 
MediaFormat description. 

The application will then locate the UUID box with the matching ID in the file and read its contents. The 
format of the audio media (e.g., mp3) that is contained in the UUID box may be specified a priori by the 
owner of the UUID format. (Or it can be published and referenced using the mechanism of the UUID Info 
Box in the JPEG2000 file format.) 

The mechanism for referring to the JPEG2000 file itself and the UUID from the XML box is summarized 
below, using the current MPEG-7 description schemes and their hierarchical structure: 

RelatedMaterial
    MediaType
        Audio
    MediaLocator
        MediaURL
    MediaInformation
        MediaProfile
            MediaFormat
                UUID



The UUID Box Format: Storing Data in the UUID Box 

The XML box is equipped with a mechanism to refer to the UUID box that contains the data, as described 
above. A format needs to be specified for the UUID box in order to organize the data within and associate 
the data with different regions and different media types. This format will be vendor specific and identified 
by the UUID. 



The following format for the UUID box is one possible example. It assumes that all the embedded data is 
stored in one single UUID box, provided that the data are within the same file. Data associated with 
different regions are identified according to their corresponding region ID. Types of data are also specified 

The Region Data Length is included to minimize parsing during navigation amongst different regions as 
the user interacts with the image. 

The Media Data Length is included to enable rapid navigation of data embedded within the same region. 



The following table includes the elements of the UUID box (in particular, the data part of the UUID box) 
ordered in a sequential manner. The first entry is the ID of the UUID box. The rest belongs to the data 
portion of the UUID box. 



UUID Box Format       Comments 

ID                    The ID of the particular UUID box is specified by the 
                      MediaInformation/MediaFormat description referenced in the 
                      RelatedMaterial description in the XML box 

Region ID             Matches the ID of the Still Region (Hot Spot) described by 
                      the StillRegion description in the XML box 

Region Data Length    Total length of data associated with this region 

Media Type            Media Type corresponds to the value of the MediaType 
                      descriptor in the RelatedMaterial description in the XML 
                      box (it may be mapped to a binary code in the UUID box) 

Media Data Length 

Media Data 

... 

Media Type 

Media Data Length 

Media Data 

... 

Region ID 

Region Data Length 

Media Type 

Media Data Length 

Media Data 

... 
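
As a minimal sketch of how the two length fields allow skipping, the following packs and searches the data portion of such a UUID box. The field widths (4-byte region ID, 4-byte lengths, 1-byte media type code), the big-endian byte order, and the names pack_regions and find_media are assumptions for illustration; the disclosure leaves these details to the vendor-specific format. The 16-byte ID of the UUID box itself is left out.

```python
import struct
from typing import Dict, List, Optional, Tuple


def pack_regions(regions: Dict[int, List[Tuple[int, bytes]]]) -> bytes:
    """Serialize {region_id: [(media_type_code, media_data), ...]}."""
    out = bytearray()
    for region_id, items in regions.items():
        body = bytearray()
        for media_type, data in items:
            # Media Type, Media Data Length, Media Data
            body += struct.pack(">BI", media_type, len(data)) + data
        # Region ID, Region Data Length, then the region's media entries
        out += struct.pack(">II", region_id, len(body)) + body
    return bytes(out)


def find_media(payload: bytes, region_id: int, media_type: int) -> Optional[bytes]:
    """Return the first matching media data, or None if absent."""
    pos = 0
    while pos < len(payload):
        rid, region_len = struct.unpack_from(">II", payload, pos)
        pos += 8
        if rid != region_id:
            pos += region_len  # Region Data Length skips the whole region
            continue
        end = pos + region_len
        while pos < end:
            mtype, media_len = struct.unpack_from(">BI", payload, pos)
            pos += 5
            if mtype == media_type:
                return payload[pos:pos + media_len]
            pos += media_len  # Media Data Length skips one media entry
        return None
    return None
```

With this layout, a reader looking for one region never parses the media entries of the other regions; it only reads their eight-byte headers.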
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It should be noted that the Region ID in the above table may be generalized to an "Object ID". The Object 
ID may then refer to any XML object, i.e., any description that is identified by an ID. In that case, using the 
invention, a Person Description may have an audio annotation associated with it, or a Summary 
Description may have executable software associated with it. MPEG-7 supports identification of XML 
descriptions using unique identifiers. 



Summary of the Use of MPEG-7 Tools in Connection with the JPEG2000 File Format Structures in 
the Alternative Embodiment 



Embedded Information          MPEG-7 Tool           JPEG2000 FF Structure 

Hot Spots (bounding box,      Still Region DS       XML Box 
ID of boxes) 

Textual Annotation            Annotation DS         XML Box 

URL Link                      Related Material DS   XML Box 

Audio/Voice Annotation Data   Related Material DS   XML Box: indicates Media Type as "Audio" and 
                                                    contains reference to the UUID Box; 
                                                    UUID Box: contains the audio data. 

Executable Code               Related Material DS   XML Box: indicates Media Type as "executable" 
                                                    and contains reference to the ID of the UUID 
                                                    box containing the executable; 
                                                    UUID Box: contains the executable. 







A Two-Level Implementation 

A smart server may first serve the client the image data, the hot spot locations, and the type and format of 
the data associated with the hot spots. The data that is of interest to the user may be delivered 
subsequently, upon the user's request, at the second level. 
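
The two levels can be sketched as two separate calls on a server object. This is only an illustration under assumptions: the class names HotSpot and SmartServer, the method names, and the dictionary layout of the first-level response are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class HotSpot:
    region_id: int
    bounding_box: Tuple[int, int, int, int]
    # Media type -> embedded data; the keys are illustrative labels.
    media: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class SmartServer:
    image_data: bytes
    hot_spots: List[HotSpot]

    def first_level(self) -> dict:
        """Image, hot spot locations, and media types; no media data."""
        return {
            "image": self.image_data,
            "hot_spots": [
                {
                    "region_id": h.region_id,
                    "bounding_box": h.bounding_box,
                    "media_types": sorted(h.media),
                }
                for h in self.hot_spots
            ],
        }

    def second_level(self, region_id: int, media_type: str) -> bytes:
        """Deliver the requested data for one hot spot."""
        for h in self.hot_spots:
            if h.region_id == region_id:
                return h.media[media_type]
        raise KeyError(region_id)
```

A client would render the first-level response, then call second_level only for the hot spot and media type the user selects, so unrequested audio or executables are never transferred.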
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(Indicate the extent of your search, attach search pages, if any): 

The following prior art has been referenced in the main body of the disclosure. 

1. ISO/IEC JTC 1/SC 29/WG1 N1646: JPEG 2000 Part I Final Committee Draft, March 2000. 

2. MPEG-7 Multimedia Description Schemes, Experimentation Model (XM) V [illegible], Geneva, May 2000. 

3. MPEG-7 Multimedia Description Schemes, Working Draft (WD) V 3.0, N3411, Geneva, May 2000. 



Inventor Signature 



Date 



Inventor Signature 



Date 



Witnessed & Understood By 



Date 



Inventor Signature 



Date 



Witnessed & Understood By 



12:09 PM June 28, 2000 



Invention Disclosure 



disclosure2.doc 



5750 NW Pacific Rim Boulevard 
Camas, Washington 98607 



4. MPEG-7 Description Definition Language (DDL) WD 3.0, N3391, Geneva, May 2000. 

5. MPEG-7 Visual Part of XM 6.0, N3398, Geneva, May 2000. 

6. MPEG-7 Visual Part, Working Draft (WD) V 3.0, N3399, Geneva, May 2000. 

7. US6070167: Hierarchical method and system for object based audiovisual descriptive tagging of 
images for information retrieval, editing and manipulation. 

8. SLA107: Image file format for embedding object based information to images 

9. SLA141: Information management system 

10. SLA237: Method for specifying preferences and usage history of audiovisual information users 

11. SLA238: Method for specifying summaries of audiovisual content 

12. SLA251: Method for specifying user descriptions allowing the specification and identification of 
multiple user preferences and history for different usage conditions 

13. SLA317: Information Management System. 







SHARP LABS PATENT DEPARTMENT 



